Abstract

We suppose that a Levy process is observed at discrete time points. A
rather general construction of minimum-distance estimators is shown to
give consistent estimators of the Levy-Khinchine characteristics as the
number of observations tends to infinity, keeping the observation distance
fixed. For a specific $C^2$-criterion this estimator is rate-optimal. The
connection with deconvolution and inverse problems is explained. A key step
in the proof is a uniform control on the deviations of the empirical
characteristic function on the whole real line.

1. Introduction

Levy processes form the fundamental building block for stochastic continuous-time 
models with jumps. There is an important trend using Levy models in finance, see Cont 
and Tankov (2004), but also many recent models in physics or biology rely on Levy pro- 
cesses. We consider here the problem of estimating the Levy-Khintchine characteristics 
from time-discrete observations of a Levy process. Since these characteristics involve the 
Levy measure (or jump measure) and we do not want to impose a parametric model, we 
face a nonparametric estimation problem. 

When the Levy process $(X_t)_{t\ge 0}$ is observed at high frequency, at times $(t_j)_{j=0,\dots,n}$
with $\max_j (t_j - t_{j-1})$ small, then a large increment $X_{t_j} - X_{t_{j-1}}$ indicates that a jump
occurred between time $t_{j-1}$ and $t_j$. Based on this insight and the continuous-time
observation analogue, nonparametric inference for Levy processes from high-frequency data has
been considered by Basawa and Brockwell (1982), Figueroa-Lopez and Houdre (2006) and
Nishiyama (2007). For low-frequency observations, however, we cannot be sure to what
extent the increment $X_{t_j} - X_{t_{j-1}}$ is due to one or several jumps or just to the Brownian
motion part of the Levy process. The only way to draw inference is to use that the
increments form independent realisations of infinitely divisible probability distributions. We
shall assume that equidistant observations at $t_i = i\Delta$, $i = 0,\dots,n$, are at our disposal and
consider the asymptotic behaviour of estimators for $n \to \infty$ and $\Delta > 0$ fixed. This can be cast
into the classical framework of i.i.d. observations $(X_{i\Delta} - X_{(i-1)\Delta})_{i=1,\dots,n}$ from an infinitely
divisible distribution. A natural question in this framework is to estimate the underlying
Levy-Khintchine characteristics. In this general setting we are only aware of the work by
Watteel and Kulperger (2003) who propose and implement an approach for estimating the
jump distribution by a fixed spectral cut-off procedure, which is related to the pilot
estimator in Section 5 below. In the special case of compound Poisson processes the problem
of estimating the jump density is known as decompounding, see van Es, Gugushvili, and 
Spreij (2007), Gugushvili (2007) and the references therein. For parametric inference under 
the assumption of a stable law see e.g. Feuerverger and McDunnough (1981b). A related 
low-frequency problem for the canonical function in Levy-Ornstein-Uhlenbeck processes 
has been considered by Jongbloed, van der Meulen, and van der Vaart (2005), where a 
consistent estimator has been constructed. 

In Section 2 we recall basic facts about Levy processes and prepare the idea of
minimum-distance estimators based on the empirical characteristic function. Under very
general conditions we then show in Section 3 consistency of these estimators for the
Levy-Khintchine characteristics. The only way to achieve this is to merge the diffusion
coefficient $\sigma^2$ and the Levy measure to a single quantity $\nu_\sigma$ which is a finite Borel measure,
and to consider weak convergence of estimators of $\nu_\sigma$. In Section 4 we construct a
rate-optimal estimator using a minimum-distance fit, based on a $C^2$-criterion for the empirical
characteristic function. A fundamental tool is Theorem 4.1 which gives a uniform
control on the deviations of the empirical characteristic function on the whole real line and
may be of independent interest. The optimal rates of convergence depend on the decay
of the characteristic function as in deconvolution problems. Interestingly, our estimator
attains the optimal rates without knowing this decay behaviour and without any further
regularisation parameter. In Section 5 we briefly discuss the implementation of the
estimator, using a two-step procedure, and show a typical numerical example. Most proofs
are postponed to Section 6.

2. Basic notions, assumptions, and a few simple facts 

We assume that we observe a one-dimensional Levy process $(X_t)_{t\ge 0}$ at equidistant
time points $0 = t_0 < t_1 < \cdots < t_n$. Such a process is characterized by its characteristic
function

$$\varphi(u,t;b,\sigma,\nu) := E[\exp(iuX_t)] = \exp\bigl(t\,\Psi(u;b,\sigma,\nu)\bigr), \quad u \in \mathbb{R},$$

where

$$\Psi(u) = \Psi(u;b,\sigma,\nu) = iub - \frac{\sigma^2}{2}u^2 + \int_{\mathbb{R}} \Bigl(e^{iux} - 1 - \frac{iux}{1+x^2}\Bigr)\,\nu(dx).$$

The triplet $(b,\sigma,\nu)$ is called Levy-Khintchine characteristic or characteristic triplet with
drift-like part $b \in \mathbb{R}$, volatility $\sigma \ge 0$ and jump measure $\nu$, which is a non-negative
$\sigma$-finite measure on $(\mathbb{R},\mathcal{B})$ with $\int \frac{x^2}{1+x^2}\,\nu(dx) < \infty$. The function $\Psi$ is called characteristic
exponent or cumulant function.

For reasons explained below, we introduce a measure $\nu_\sigma$ by

$$\nu_\sigma(dx) = \sigma^2\delta_0(dx) + \frac{x^2}{1+x^2}\,\nu(dx),$$

where $\delta_0$ denotes the point measure in zero. This gives another representation of $\Psi$ in
terms of $b \in \mathbb{R}$ and the finite Borel measure $\nu_\sigma$ as

$$\Psi(u) = \Psi(u;b,\nu_\sigma) = iub + \int_{\mathbb{R}} \Bigl(e^{iux} - 1 - \frac{iux}{1+x^2}\Bigr)\frac{1+x^2}{x^2}\,\nu_\sigma(dx).$$

Here we have used the continuous extension of the integrand at $x = 0$, which evaluates
to $-u^2/2$. Let $P_{b,\nu_\sigma}$ denote the probability distribution with characteristic function
$\varphi(\cdot,t;b,\nu_\sigma) = \exp(t\,\Psi(\cdot;b,\nu_\sigma))$ for some fixed $t > 0$. Writing $\mu_n \Rightarrow \mu$ for weak
convergence of the finite Borel measures $\mu_n$ to the finite Borel measure $\mu$ on $(\mathbb{R},\mathcal{B})$, the following
well-known result will be essential in the sequel (Theorem VII.2.9 and Remark VII.2.10
in Jacod and Shiryaev (2002) or Theorem 19.1 in Gnedenko and Kolmogorov (1968)).

Proposition 2.1. The convergence $P_{b_n,\nu_{\sigma,n}} \Rightarrow P_{b,\nu_\sigma}$ takes place if and only if $b_n \to b$
and $\nu_{\sigma,n} \Rightarrow \nu_\sigma$.



By the scaling properties of Levy processes there is no loss in generality when we
suppose $t_k = k$, $k = 0,\dots,n$. We write $\varphi(u;b,\nu_\sigma)$ short for $\varphi(u,1;b,\nu_\sigma)$. Let us introduce
the empirical characteristic function of the increments

$$\hat\varphi_n(u) := \frac{1}{n}\sum_{t=1}^{n} e^{iu(X_t - X_{t-1})}, \quad u \in \mathbb{R}.$$





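Since these increments are independent and identically distributed it follows from the
Glivenko-Cantelli theorem that

$$P_{b,\nu_\sigma}\bigl(\hat\varphi_n(u) \to \varphi(u;b,\nu_\sigma)\ \ \forall u \in \mathbb{R}\bigr) = 1. \tag{2.1}$$

We will consider minimum distance fits, that is, we intend to choose $\hat b_n$ and $\hat\nu_{\sigma,n}$ such
that, for an appropriate metric $d$,

$$d\bigl(\hat\varphi_n, \varphi(\cdot;\hat b_n,\hat\nu_{\sigma,n})\bigr) = \inf_{b \in \mathbb{R},\ \nu_\sigma \in \mathcal{M}(\mathbb{R})} d\bigl(\hat\varphi_n, \varphi(\cdot;b,\nu_\sigma)\bigr). \tag{2.2}$$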



Here $\mathcal{M}(\mathbb{R})$ denotes the space of all finite Borel measures on $(\mathbb{R},\mathcal{B})$. Our basic motivation
for this estimation procedure arises from the fact that an exact maximum likelihood
estimator is not feasible since there is in general no closed form expression for the probability
density of the observations available. Moreover, it is well-known that methods based on
the empirical characteristic function can be asymptotically efficient; see Feuerverger and
McDunnough (1981a, 1981b). Since we are not sure that the infimum in (2.2) is always
attained, we take a sequence of positive reals $(\delta_n)_{n \in \mathbb{N}}$ with $\delta_n \to 0$ as $n \to \infty$ and choose
$\hat b_n$ and $\hat\nu_{\sigma,n}$ such that

$$d\bigl(\hat\varphi_n, \varphi(\cdot;\hat b_n,\hat\nu_{\sigma,n})\bigr) \le \inf_{b \in \mathbb{R},\ \nu_\sigma \in \mathcal{M}(\mathbb{R})} d\bigl(\hat\varphi_n, \varphi(\cdot;b,\nu_\sigma)\bigr) + \delta_n. \tag{2.3}$$



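For the metric $d$, we assume that

$$\lim_{n\to\infty} d\bigl(\hat\varphi_n, \varphi(\cdot;b,\nu_\sigma)\bigr) = 0 \quad P_{b,\nu_\sigma}\text{-almost surely} \tag{2.4}$$

and that the following implication holds:

$$\lim_{n\to\infty} d\bigl(\varphi(\cdot;b_n,\nu_{\sigma,n}), \varphi(\cdot;b,\nu_\sigma)\bigr) = 0 \ \Longrightarrow\ \lim_{n\to\infty} \int_s^t \varphi(u;b_n,\nu_{\sigma,n})\,du = \int_s^t \varphi(u;b,\nu_\sigma)\,du \quad \forall s,t \in \mathbb{R}. \tag{2.5}$$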

A simple example of such a distance is given by the weighted $L^p$-norms,

$$d(\varphi_1,\varphi_2) = \Bigl(\int_{-\infty}^{\infty} |\varphi_1(u) - \varphi_2(u)|^p\, w(u)\,du\Bigr)^{1/p},$$

where $p \ge 1$ and $w : \mathbb{R} \to (0,\infty)$ is a continuous weight function with $\int w(u)\,du < \infty$.
Then Assumption (2.4) follows by dominated convergence from the convergence result
(2.1), while Assumption (2.5) is immediate.
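
As an illustration of the objects just introduced, the following minimal Python sketch evaluates the empirical characteristic function of the increments and a grid approximation of a weighted $L^p$-distance. The function names, the Gaussian test data, the weight $w(u) = e^{-u^2/8}$ and the truncation of the integral to $[-10,10]$ are assumptions made for this example only.

```python
import numpy as np

def empirical_cf(increments, u):
    """Empirical characteristic function (1/n) * sum_t exp(i*u*(X_t - X_{t-1}))."""
    increments = np.asarray(increments, dtype=float)
    u = np.atleast_1d(np.asarray(u, dtype=float))
    return np.exp(1j * np.outer(u, increments)).mean(axis=1)

def weighted_lp_distance(phi1, phi2, u_grid, weight, p=2.0):
    """Trapezoidal approximation of (int |phi1 - phi2|^p w du)^(1/p) on a finite grid."""
    integrand = np.abs(phi1 - phi2) ** p * weight
    integral = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(u_grid))
    return integral ** (1.0 / p)

# Hypothetical data: increments of a Brownian motion with drift 0.5 and sigma = 1.
rng = np.random.default_rng(0)
increments = rng.normal(loc=0.5, scale=1.0, size=2000)

u_grid = np.linspace(-10.0, 10.0, 401)
weight = np.exp(-u_grid ** 2 / 8.0)          # assumed integrable weight function w
phi_hat = empirical_cf(increments, u_grid)
phi_true = np.exp(1j * 0.5 * u_grid - 0.5 * u_grid ** 2)
dist = weighted_lp_distance(phi_hat, phi_true, u_grid, weight)
print(dist)
```

The distance between the empirical and the true characteristic function should shrink as the sample size grows, which is the content of Assumption (2.4) for this choice of $d$.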



3. Consistency 

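We derive from the triangle inequality, the definition of the minimum-distance
estimator and Assumption (2.4) that

$$\begin{aligned} d\bigl(\varphi(\cdot;\hat b_n,\hat\nu_{\sigma,n}), \varphi(\cdot;b,\nu_\sigma)\bigr) &\le d\bigl(\varphi(\cdot;\hat b_n,\hat\nu_{\sigma,n}), \hat\varphi_n\bigr) + d\bigl(\hat\varphi_n, \varphi(\cdot;b,\nu_\sigma)\bigr) \\ &\le 2\,d\bigl(\hat\varphi_n, \varphi(\cdot;b,\nu_\sigma)\bigr) + \delta_n \ \to\ 0 \quad P_{b,\nu_\sigma}\text{-a.s.} \end{aligned} \tag{3.1}$$

By Assumption (2.5) this implies for the integrated characteristic function that

$$\int_s^t \varphi(u;\hat b_n,\hat\nu_{\sigma,n})\,du \ \to\ \int_s^t \varphi(u;b,\nu_\sigma)\,du \quad \forall s,t \in \mathbb{R}. \tag{3.2}$$

By Theorem 6.3.3 in Chung (1974, page 163), we obtain from (3.2) that

$$P_{\hat b_n,\hat\nu_{\sigma,n}} \xrightarrow{\ v\ } P_{b,\nu_\sigma},$$

where $\xrightarrow{\ v\ }$ denotes vague convergence to a possibly defective (that is, with a mass less
than 1) measure. However, since this vague limit is a probability measure, it turns out
that the mode of convergence is actually the weak one, that is,

$$P_{\hat b_n,\hat\nu_{\sigma,n}} \Rightarrow P_{b,\nu_\sigma}. \tag{3.3}$$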



As an immediate consequence of Equation (3.3) and Proposition 2.1 above we obtain 
the following consistency result for the parameters of the Levy process: 



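Theorem 3.1. If the distance $d$ satisfies properties (2.4) and (2.5), then the minimum
distance fit $(\hat b_n,\hat\nu_{\sigma,n})$ is a strongly consistent estimator, that is, with probability one we
have for $n \to \infty$

$$\hat b_n \to b \quad\text{and}\quad \hat\nu_{\sigma,n} \Rightarrow \nu_\sigma.$$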



Remark 3.2. Without further assumptions we cannot estimate the diffusion parameter $\sigma$
in a uniformly consistent way. We have for example that the stable law with characteristic
function $\varphi_\alpha(u) = e^{-|u|^\alpha/2}$ converges for $\alpha \uparrow 2$ to the standard normal law ($\alpha = 2$) in
total variation norm: by Scheffe's Lemma it suffices to show pointwise convergence of the
density functions, which follows from the $L^1$-convergence of the characteristic functions.
Hence, for $n$ observations no test can separate the hypotheses $H_0 : \alpha = 2$ and $H_1 : \alpha < 2$.
Since we have $\sigma = 1$ for $\alpha = 2$ and $\sigma = 0$ for $\alpha < 2$, this implies for the estimation problem
uniform inconsistency in the following sense:

$$\limsup_{n\to\infty}\ \inf_{\hat\sigma_n}\ \sup_{b,\nu_\sigma}\ P_{b,\nu_\sigma}\bigl(|\hat\sigma_n - \sigma| \ge 1/2\bigr) > 0,$$

where the infimum is taken over all estimators based on $n$ observations. Thus, from a
statistical perspective the estimation of the volatility $\sigma$ makes no sense, unless we restrict
the class of Levy processes under consideration, e.g. to the finite intensity case as in
Belomestny and Reiß (2006).

The practical implementation of the minimum distance method naturally raises the
question of computational feasibility. It is certainly not possible to compute $\hat\nu_{\sigma,n}$ by an
optimisation over the full set $\mathcal{M}(\mathbb{R})$. In our simulations, for example, we approximate the
measure by measures with step-wise constant densities. To assess the effect of such an
approximation, consider a sequence of subsets $\mathcal{M}^{(n)} \subseteq \mathcal{M}(\mathbb{R})$ with the density property
that there exist measures $\nu^{(n)} \in \mathcal{M}^{(n)}$ with $\nu^{(n)} \Rightarrow \nu_\sigma$ as $n \to \infty$. The definition from
(2.3) is now replaced by

$$d\bigl(\hat\varphi_n, \varphi(\cdot;\hat b_n,\hat\nu_{\sigma,n})\bigr) \le \inf_{b \in \mathbb{R},\ \nu_\sigma \in \mathcal{M}^{(n)}} d\bigl(\hat\varphi_n, \varphi(\cdot;b,\nu_\sigma)\bigr) + \delta_n.$$



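We obtain instead of (3.1) that

$$d\bigl(\varphi(\cdot;\hat b_n,\hat\nu_{\sigma,n}), \varphi(\cdot;b,\nu_\sigma)\bigr) \le 2\,d\bigl(\hat\varphi_n, \varphi(\cdot;b,\nu_\sigma)\bigr) + d\bigl(\varphi(\cdot;b,\nu^{(n)}), \varphi(\cdot;b,\nu_\sigma)\bigr) + \delta_n \ \to\ 0 \quad P_{b,\nu_\sigma}\text{-a.s.}$$

Hence, we obtain in complete analogy to Theorem 3.1 that with probability one for $n \to \infty$

$$\hat b_n \to b \quad\text{and}\quad \hat\nu_{\sigma,n} \Rightarrow \nu_\sigma.$$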



Given the existence of certain moments for $P_{b,\nu_\sigma}$, we could also search our
minimum-distance estimator in the class of those parameter values that fit the empirical moments.
Using a similar error decomposition and the consistency of the empirical moments, this 
approach will also yield consistent estimators under mild conditions on the distance d. 

4. A rate-optimal estimator

4.1. The construction. In this section we intend to devise estimators which attain
optimal rates of convergence. We henceforth restrict the class of Levy processes to those with
finite second moments. This is equivalent to requiring that the Levy measure satisfies
$\int x^2\,\nu(dx) < \infty$. In this case the following reparametrisation of the characteristic exponent
is much more convenient:

$$\Psi(u;b,\sigma,\nu) = iub - \frac{\sigma^2}{2}u^2 + \int_{\mathbb{R}} \bigl(e^{iux} - 1 - iux\bigr)\,\nu(dx),$$

where the parameter $b$ is now the drift-like part of Section 2 shifted by
$\int \bigl(x - \frac{x}{1+x^2}\bigr)\,\nu(dx)$ and indeed denotes the mean trend because
of $E[X_1] = -i\varphi'(0) = b$. Let us mention that this is the original Kolmogorov canonical
representation of a Levy process (Kolmogorov 1932), the historical background of which
is nicely exposed by Mainardi and Rogosin (2006). Instead of the measure $\nu_\sigma$ from Section 2,
we consider from now on the finite measure defined by

$$\nu_\sigma(dx) = \sigma^2\delta_0(dx) + x^2\,\nu(dx),$$

which allows the nice identity $\mathrm{Var}(X_1) = -\varphi''(0) + \varphi'(0)^2 = \nu_\sigma(\mathbb{R})$. From now on, we shall
express the characteristic exponent in terms of $(b,\nu_\sigma)$:

$$\Psi(u) = \Psi(u;b,\nu_\sigma) = iub + \int_{\mathbb{R}} \frac{e^{iux} - 1 - iux}{x^2}\,\nu_\sigma(dx).$$

While $b$ can be easily estimated by $\frac{1}{n}\sum_{t=1}^{n}(X_t - X_{t-1}) = X_n/n$, the construction of
an optimal nonparametric estimator of $\nu_\sigma$ requires more work. Before we start with our
search for optimal rates of convergence for estimators of $\nu_\sigma$, we have to decide about
an appropriate metric to measure the deviation of any potential estimator $\hat\nu_{\sigma,n}$ from its
target $\nu_\sigma$.

The parameter $\nu_\sigma$ lies in the space of finite Borel measures, which is naturally equipped
with the total variation norm. As we have seen above in the consistent estimation problem
for $\sigma$, this topology is too strong here. Moreover, we are usually not interested in the
problem of estimating $\nu_\sigma$ itself, but rather in estimating integrals $\int f\,d\nu_\sigma$ for certain
integrands $f$. In mathematical finance for example, the so-called $\Delta$ in the quadratic hedging
approach requires calculating an integral with respect to $\nu(dz)$ in which $C(t,S)$ denotes the option
price at time $t$ and $S$ the corresponding stock price, cf. Proposition 10.5 in Cont and
Tankov (2004). This is why we choose to measure the performance of our estimator by
metrizing weak convergence with certain classes $F$ of continuous test functions $f$:

$$\ell(\hat\nu_{\sigma,n},\nu_\sigma) = \sup_{f \in F}\ \Bigl|\int f\,d\hat\nu_{\sigma,n} - \int f\,d\nu_\sigma\Bigr|.$$

Note that for any class $F$ of uniformly bounded, equicontinuous functions consistency
with respect to weak convergence implies $\ell(\hat\nu_{\sigma,n},\nu_\sigma) \to 0$ (Dudley 1989, Cor. 11.3.4). For
instance, the bounded Lipschitz metric is generated by the test functions of Lipschitz norm
less than one.

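Let us introduce the Fourier transform for functions $f \in L^1(\mathbb{R})$ or measures $\mu \in \mathcal{M}(\mathbb{R})$
by

$$\mathcal{F}f(u) := \int e^{iux} f(x)\,dx, \qquad \mathcal{F}\mu(u) := \int e^{iux}\,\mu(dx).$$

Note that we have by Parseval's equality

$$\int f\,d\nu_\sigma = \frac{1}{2\pi}\int \mathcal{F}f(u)\,\overline{\mathcal{F}\nu_\sigma(u)}\,du,$$

provided $\mathcal{F}f \in L^1(\mathbb{R})$ (Katznelson 1976, Theorem VI.2.2). Estimation of $\nu_\sigma$ turns out to
be particularly transparent when we employ the fact that

$$\Psi''(u) = \frac{d^2}{du^2}\int \frac{e^{iux} - 1 - iux}{x^2}\,\nu_\sigma(dx) = -\mathcal{F}\nu_\sigma(u)$$

and consequently

$$\mathcal{F}\nu_\sigma(u) = -\frac{d^2}{du^2}\log(\varphi(u)) = \frac{\varphi'(u)^2}{\varphi(u)^2} - \frac{\varphi''(u)}{\varphi(u)}. \tag{4.1}$$

Recall that $\int x^2\,\nu(dx) < \infty$ implies $E[X_1^2] < \infty$ and hence $\varphi \in C^2$. Moreover, in order
to recover $\Psi$ from $\varphi$ we use the distinguished logarithm of the complex-valued function
$u \mapsto \varphi(u)$, which is required to ensure $\log(\varphi(0)) = 0$ and continuity of $u \mapsto \log(\varphi(u))$,
cf. Cont and Tankov (2004). This formula indicates that estimating $\nu_\sigma$ is strongly related
to estimating $\varphi$ in a $C^2$-sense. Before we study rates of convergence, we need to investigate
uniform rates of convergence of the empirical characteristic function $\hat\varphi_n$ and its derivatives.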
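
The identity (4.1) can be checked numerically on a toy model. The sketch below assumes a hypothetical Levy process with drift $b = 0.3$, volatility $\sigma = 0.7$ and Poisson jumps of size one with intensity $\lambda = 2$, so that $\nu_\sigma = \sigma^2\delta_0 + \lambda\delta_1$ and $\mathcal{F}\nu_\sigma(u) = \sigma^2 + \lambda e^{iu}$. The distinguished logarithm is obtained by unwrapping the phase along a fine grid, and the second derivative is approximated by central differences; all parameter values and the grid are choices made for this example.

```python
import numpy as np

b, sigma, lam = 0.3, 0.7, 2.0      # hypothetical parameters of the toy model
h = 1e-3
u = np.arange(-5000, 5001) * h      # symmetric grid on [-5, 5], u[5000] == 0

# Characteristic exponent in the parametrisation of Section 4 and phi = exp(Psi).
psi = 1j * u * b - 0.5 * sigma ** 2 * u ** 2 + lam * (np.exp(1j * u) - 1 - 1j * u)
phi = np.exp(psi)

# Distinguished logarithm: continuous phase, normalised so that log(phi(0)) = 0.
phase = np.unwrap(np.angle(phi))
phase -= phase[5000]
log_phi = np.log(np.abs(phi)) + 1j * phase

# Central second differences approximate (log phi)''.
second = (log_phi[2:] - 2 * log_phi[1:-1] + log_phi[:-2]) / h ** 2
fourier_nu = sigma ** 2 + lam * np.exp(1j * u[1:-1])   # F nu_sigma for this model
max_err = np.max(np.abs(-second - fourier_nu))
print(max_err)
```

Taking the principal branch of the logarithm instead of the unwrapped phase would introduce jumps of size $2\pi i$ and destroy the second differences, which is why the continuity requirement matters in practice.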

4.2. Estimating the characteristic function. For i.i.d. random variables $(Z_t)_{t \in \mathbb{N}}$,
denote by

$$C_n(u) := n^{-1/2}\sum_{t=1}^{n}\bigl(e^{iuZ_t} - E[e^{iuZ_1}]\bigr)$$

the normalized characteristic function process. Furthermore, denote by $C_n^{(k)}$ its $k$th
derivative which exists if $E|Z_1|^k < \infty$. For an appropriate weight function $w : \mathbb{R} \to [0,\infty)$, we
consider

$$\|C_n^{(k)}\|_{L^\infty(w)} := \sup_{u \in \mathbb{R}}\bigl\{|C_n^{(k)}(u)|\,w(u)\bigr\}.$$

For every $k$ we have the following general result.

Theorem 4.1. Suppose that $(Z_t)_{t \in \mathbb{N}}$ are i.i.d. random variables with $E|Z_1|^{2k+\gamma} < \infty$ for
some $\gamma > 0$ and let the weight function be defined as $w(u) = (\log(e + |u|))^{-1/2-\delta}$ for some
$\delta > 0$. Then

$$\sup_{n \ge 1} E\,\|C_n^{(k)}\|_{L^\infty(w)} < \infty.$$

Its proof is given in Section 6.1. Let us mention that the logarithmic decay of the
weight function $w$ is in accordance with the well known result that $\hat\varphi_n \to \varphi$ a.s. holds
uniformly on intervals $[-T_n,T_n]$ whenever $\log(T_n)/n \to 0$, cf. Csorgo and Totik (1983).
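
To get a feeling for the statement of Theorem 4.1 in the case $k = 0$, the following sketch computes the weighted supremum of $|C_n(u)|$ over a finite frequency grid for simulated standard normal variables and a few sample sizes. The grid $[-50,50]$, the value $\delta = 0.1$, the sample sizes and the function name are assumptions for illustration; a finite grid can only approximate the supremum over the whole real line.

```python
import numpy as np

def weighted_sup_cn(z, u, delta=0.1):
    """Grid approximation of sup_u w(u) |C_n(u)| for standard normal Z_t (k = 0)."""
    n = z.size
    ecf = np.exp(1j * np.outer(u, z)).mean(axis=1)
    cn = np.sqrt(n) * (ecf - np.exp(-0.5 * u ** 2))    # E exp(iuZ_1) = exp(-u^2/2)
    w = np.log(np.e + np.abs(u)) ** (-0.5 - delta)     # weight from Theorem 4.1
    return float(np.max(w * np.abs(cn)))

rng = np.random.default_rng(2)
u = np.linspace(-50.0, 50.0, 501)
sup_norms = {n: weighted_sup_cn(rng.standard_normal(n), u) for n in (250, 1000, 4000)}
print(sup_norms)
```

In line with the theorem, the simulated values should stay of order one instead of growing with $n$.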



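4.3. Upper risk bounds. In view of (4.1) and Theorem 4.1 we define our estimators of
$b$ and $\nu_\sigma$ by a minimum distance fit based on a weighted $C^2$-norm, denoted by $d^{(2)}$.
We choose the estimators $\hat b_n \in \mathbb{R}$ and $\hat\nu_{\sigma,n} \in \mathcal{M}(\mathbb{R})$ such that

$$d^{(2)}\bigl(\varphi(\cdot;\hat b_n,\hat\nu_{\sigma,n}), \hat\varphi_n\bigr) \le \inf_{b \in \mathbb{R},\ \nu_\sigma \in \mathcal{M}(\mathbb{R})} d^{(2)}\bigl(\varphi(\cdot;b,\nu_\sigma), \hat\varphi_n\bigr) + \delta_n, \tag{4.2}$$

where $\delta_n \to 0$ as $n \to \infty$. We verify by Theorem 4.1 that $d^{(2)}$ satisfies Assumptions (2.4)
and (2.5), hence Theorem 3.1 gives immediately a consistency result. Moreover, with the
choice $\delta_n = O(n^{-1/2})$ these estimators will turn out to be rate-optimal.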
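
A weighted $C^2$-type distance can be sketched in code as follows. The concrete form used here, the sum over $k = 0, 1, 2$ of the weighted supremum norms of the $k$th derivative differences evaluated on a finite grid, is an assumption of this example and not necessarily the exact definition of $d^{(2)}$. The Gaussian model family, the parameter values and all function names are likewise hypothetical.

```python
import numpy as np

def weight(u, delta=0.1):
    """Logarithmically decaying weight w(u) = (log(e + |u|))^(-1/2 - delta)."""
    return np.log(np.e + np.abs(u)) ** (-0.5 - delta)

def ecf_derivatives(y, u):
    """Empirical characteristic function of the increments y and its first two derivatives."""
    e = np.exp(1j * np.outer(u, y))
    return [((1j * y) ** k * e).mean(axis=1) for k in range(3)]

def gaussian_cf_derivatives(u, b, s2):
    """phi, phi', phi'' for phi(u) = exp(i*u*b - s2*u^2/2), i.e. nu_sigma = s2 * delta_0."""
    phi = np.exp(1j * u * b - 0.5 * s2 * u ** 2)
    a = 1j * b - s2 * u
    return [phi, a * phi, (a ** 2 - s2) * phi]

def d2(derivs1, derivs2, w):
    """Assumed form of the criterion: sum of weighted sup-norms of derivative differences."""
    return sum(float(np.max(w * np.abs(f - g))) for f, g in zip(derivs1, derivs2))

rng = np.random.default_rng(3)
y = rng.normal(0.5, 1.0, size=2000)           # hypothetical increments, b = 0.5, sigma = 1
u = np.linspace(-20.0, 20.0, 401)
w = weight(u)
emp = ecf_derivatives(y, u)
d_true = d2(gaussian_cf_derivatives(u, 0.5, 1.0), emp, w)
d_wrong = d2(gaussian_cf_derivatives(u, 1.5, 1.0), emp, w)
print(d_true, d_wrong)
```

A minimum distance fit in the sense of (4.2) would minimise such a criterion over the parameter; the comparison of a correct and an incorrect drift only indicates how the criterion separates them.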

While $b$ can always be estimated at rate $n^{-1/2}$, rates of convergence of $\int f\,d\hat\nu_{\sigma,n}$ as an
estimator of $\int f\,d\nu_\sigma$ depend both on the smoothness of $f$ and on the decay of $|\varphi(u)|$ as
$|u| \to \infty$. For the function $f$, we will assume that it belongs to the class

$$F_s := \Bigl\{ f :\ \int (1 + |u|)^s\,|\mathcal{F}f(u)|\,du \le 1 \Bigr\}$$

for some $s \ge 0$. Note that $\int |\mathcal{F}f(u)|\,du \le 1$ implies by the Riemann-Lebesgue Lemma
that $f$ is continuous with $\|f\|_\infty \le 1$. By Fourier theory the condition $f \in F_s$ is slightly
stronger than requiring $f \in C^s$ with $\|f\|_{C^s} \le 1$ for a suitable norming of $C^s$. We therefore
introduce a loss function for an estimator $\hat\mu$ of the finite measure $\mu$ by

$$\ell_s(\hat\mu,\mu) := \sup_{f \in F_s}\ \Bigl|\int f\,d\hat\mu - \int f\,d\mu\Bigr|.$$

Note that by duality the loss $\ell_s$ can be interpreted as a negative smoothness norm of
order $-s$.



The faster $|\varphi(u)|$ decays, the more difficult it will be to estimate $\nu_\sigma$. We consider in
particular the following three cases:

(a) Gaussian part

If $\sigma^2 > 0$, then the characteristic function $\varphi$ has Gaussian tails, i.e.

$$\log|\varphi(u)| = \mathrm{Re}\bigl(\log(\varphi(u))\bigr) = -\sigma^2u^2/2\,(1 + o(1)) \quad\text{as } |u| \to \infty.$$

(To see this, note that $F(x,u) := (e^{iux} - 1 - iux)/(ux)^2$ is uniformly bounded
with $\lim_{|u|\to\infty} F(x,u) = 0$ for $x \ne 0$ such that by dominated convergence
$\lim_{|u|\to\infty} \int_{\mathbb{R}} F(x,u)\,x^2\,\nu(dx) = 0$ and thus $\log\varphi(u) = -\sigma^2u^2/2 + o(u^2)$.)

(b) Exponential decay

Here the characteristic function $\varphi$ decays at most exponentially, i.e. for some $\alpha > 0$,
$C > 0$,

$$|\varphi(u)| \ge C e^{-\alpha|u|} \quad\text{for all } u \in \mathbb{R}.$$

Examples of distributions with this property include normal inverse Gaussian
(Cont and Tankov 2004, page 117), and generalized tempered stable distributions
(Cont and Tankov 2004, page 122).

(c) Polynomial decay

In this case the characteristic function satisfies, for some $\beta \ge 0$, $C > 0$,

$$|\varphi(u)| \ge C(1 + |u|)^{-\beta} \quad\text{for all } u \in \mathbb{R}.$$

Typical examples for this are the compound Poisson distribution, the gamma
distribution, the variance gamma distribution and the generalized hyperbolic
distribution (Cont and Tankov 2004, pages 75, 116, 117, 127).



The proof of the following main theorem is postponed to Section 6.2.

Theorem 4.2. Suppose that $E_{b,\nu_\sigma}|X_1|^{4+\gamma} < \infty$ for some $\gamma > 0$. We choose the weight
function $w$ as $w(u) = (\log(e + |u|))^{-1/2-\delta}$, where $\delta$ is any positive number. The estimators
$\hat b_n$ and $\hat\nu_{\sigma,n}$ of $b$ and $\nu_\sigma$, respectively, are chosen according to (4.2) with $\delta_n = O(n^{-1/2})$.
Then

$$E_{b,\nu_\sigma}|\hat b_n - b| = O(n^{-1/2})$$

and for any $s > 0$



^si'^a,n,'^a) = Op^ 



n 



-1/2. 



«e[0,?„] \ w{u)\ip{u;b,v^)\ 



fl + n)2- 



where 



t/r, := inf < u > : 



> 1 



w{u)\ip{u]b,v„)\ 

The constants in the risk bounds depend continuously on $|b|$ and $\nu_\sigma(\mathbb{R})$. In the specific
cases we obtain the following rates of convergence for $\ell_s(\hat\nu_{\sigma,n},\nu_\sigma)$ in $P_{b,\nu_\sigma}$-probability:

(a) Gaussian part: $(\log n)^{-s/2}$

(b) Exponential decay: $(\log n)^{-s}$

(c) Polynomial decay of order $\beta \ge 0$: $[(\log n)^{1/2+\delta}\,n^{-1/2}]^{s/\beta} \vee n^{-1/2}$
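
To compare the three regimes numerically, the rates listed in (a), (b) and (c) can be evaluated for concrete values of $n$, $s$, $\beta$ and $\delta$. The helper functions below are hypothetical names introduced for this illustration; constants are ignored, and for $\beta = 0$ the exponent $s/\beta$ is read as $+\infty$, so that only the parametric rate $n^{-1/2}$ remains (an interpretation assumed here for large $n$).

```python
import math

def rate_gaussian(n, s):
    """Case (a): (log n)^(-s/2)."""
    return math.log(n) ** (-s / 2.0)

def rate_exponential(n, s):
    """Case (b): (log n)^(-s)."""
    return math.log(n) ** (-s)

def rate_polynomial(n, s, beta, delta):
    """Case (c): [(log n)^(1/2+delta) n^(-1/2)]^(s/beta), but never faster than n^(-1/2)."""
    parametric = n ** -0.5
    if beta == 0:
        return parametric            # assumed reading of s/beta = infinity
    base = math.log(n) ** (0.5 + delta) * parametric
    return max(base ** (s / beta), parametric)

n = 10 ** 6
for s in (0.5, 1.0, 2.0):
    print(s, rate_gaussian(n, s), rate_exponential(n, s), rate_polynomial(n, s, 1.0, 0.1))
```

Even for a million observations the logarithmic rates in cases (a) and (b) remain far from the polynomial rates of case (c), which illustrates how strongly the decay of $|\varphi|$ drives the difficulty of the problem.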
Remark 4.3. The results are presented for convergence in probability, but the proof
immediately yields convergence of moments of order 1/2 of the loss in cases (a), (b), cf.
Equation (6.4). Higher moments are achieved whenever the order of the moment bound
in Theorem 4.1 can be increased.

4.4. Lower risk bounds. We prove that the rates of convergence obtained in Theorem
4.2 for cases (a), (b), (c) are optimal, at least up to a logarithmic factor in the latter case.
The proof in Section 6.3 can be naturally generalized to cover further decay scenarios of
the characteristic function.

Theorem 4.4. For C,C > large enough and for any a > 0, f3 ^ introduce the 
following nonparametric classes of v„: 

A{C,a) := M{M) i/<,(M) ^ c} (a>{)), 

B{C,a) := {z^^ G 7W(M) I i/^(M) ^ C, \ip{u)\ ^ Ce""!"!} (a = 0), 

C(C,(5,/3) := {u^ G MiR) | i/,(M) ^ C, \ip{u)\ ^ C-\l + \u\)-^^ (a = 0). 

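Then we obtain for some fixed $b \in \mathbb{R}$ and for any $s > 0$ the following minimax lower
bounds, where $\hat\nu_{\sigma,n}$ denotes any estimator of $\nu_\sigma$ based on $n$ observations and the supremum
in (a), (b), (c) is taken over the classes A, B and C defined above, respectively:

(a) $\exists \varepsilon > 0 :\ \liminf_{n\to\infty}\ \inf_{\hat\nu_{\sigma,n}}\ \sup_{\nu_\sigma}\ P_{b,\nu_\sigma}\bigl((\log n)^{s/2}\,\ell_s(\hat\nu_{\sigma,n},\nu_\sigma) > \varepsilon\bigr) > 0$,

(b) $\exists \varepsilon > 0 :\ \liminf_{n\to\infty}\ \inf_{\hat\nu_{\sigma,n}}\ \sup_{\nu_\sigma}\ P_{b,\nu_\sigma}\bigl((\log n)^{s}\,\ell_s(\hat\nu_{\sigma,n},\nu_\sigma) > \varepsilon\bigr) > 0$,

(c) $\exists \varepsilon > 0 :\ \liminf_{n\to\infty}\ \inf_{\hat\nu_{\sigma,n}}\ \sup_{\nu_\sigma}\ P_{b,\nu_\sigma}\bigl(n^{(s/2\beta)\wedge(1/2)}\,\ell_s(\hat\nu_{\sigma,n},\nu_\sigma) > \varepsilon\bigr) > 0$.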



4.5. Discussion. The convergence rates for $\hat\nu_{\sigma,n}$ can be understood in analogy with a
deconvolution problem where the Fourier transform of the error density decays like the
characteristic function $\varphi$ in our case, see e.g. Fan (1991). The interesting point here is
that this decay property is not assumed to be known and depends on the parameters to
be estimated. At first sight, it is rather surprising that our minimum distance estimator
adapts automatically to the decay of $\varphi$, even for the whole range of loss functions $\ell_s$, $s > 0$.
This is due to the fact that the noise level in the empirical characteristic function $\hat\varphi_n$ is of
the same size for different frequencies and this is where we fit our estimator. In contrast,
when fitting the characteristic exponent, which is more attractive from a computational
point of view and for example advocated in Jongbloed, van der Meulen, and van der Vaart
(2005), we face a highly heteroskedastic noise level in $\log(\hat\varphi_n(u))$ governed by $|\varphi(u)|^{-1}$
because of $\log(\hat\varphi_n(u)) - \Psi(u) \approx (\hat\varphi_n(u) - \varphi(u))/\varphi(u)$.

Another point of view on our estimation problem is that we want to estimate the linear
functional $\int f\,d\nu_\sigma$ based on an inverse problem setting for estimating $\nu_\sigma$. In an abstract
Hilbert scale context, adaptive estimation for this has been considered by Goldenshluger
and Pereverzev (2003) and their rate for the polynomially ill-posed case reads in our
notation $(n/\log(n))^{-(r+s)/(2r+2\beta)} \vee n^{-1/2}$ with $r$ the regularity of $\nu_\sigma$, $s$ the regularity of
$f$ and $\beta$ the degree of ill-posedness. In our case, we measure the regularity $s$ of $f$ in the
Fourier domain by an $L^1$-criterion such that a dual $L^\infty$-criterion for the regularity of $\nu_\sigma$
yields $r = 0$ because $\|\mathcal{F}\nu_\sigma\|_\infty$ is finite. Hence, the rate $(n/\log(n))^{-s/2\beta} \vee n^{-1/2}$, up to the
logarithmic factor of power $\delta$, obtained in case (c) of Theorem 4.2 confirms this analogy.
We suspect that the gap by a logarithmic factor in the polynomial case between our upper
and lower bound is mainly due to a suboptimal lower bound, because $\ell_s$ can be expressed
in the Fourier domain via

$$\ell_s(\hat\mu,\mu) = \sup_{f \in F_s} \frac{1}{2\pi}\Bigl|\int \mathcal{F}f(u)\,\overline{\mathcal{F}(\hat\mu - \mu)(u)}\,du\Bigr| = \frac{1}{2\pi}\sup_{u \in \mathbb{R}}\,(1 + |u|)^{-s}\,|\mathcal{F}(\hat\mu - \mu)(u)|,$$

giving a supremum-type norm.

It is certainly remarkable that no regularisation parameter is involved in our estimation
procedure, which becomes more intuitive by noticing that the results of Section 3 imply
consistency already for $s = 0$. On the other hand, better rates of convergence can be obtained
when we restrict the model to measures $\nu_\sigma$ which have a regular Lebesgue density $g_\sigma$. A
natural plug-in approach yields the kernel-type estimator $\hat g_{\sigma,n,h}(x) := K_h * \hat\nu_{\sigma,n}(x)$,
convolving the minimum-distance estimator with a smooth kernel $K_h$ of bandwidth $h > 0$.
Noting that $\int f\,\hat g_{\sigma,n,h} = \int (f * K_h)\,d\hat\nu_{\sigma,n}$, we infer that the stochastic error

$$\int f\,(\hat g_{\sigma,n,h} - K_h * g_\sigma) = \int (f * K_h)\,d(\hat\nu_{\sigma,n} - \nu_\sigma)$$

is controlled by the regularity of $f * K_h$. To be more specific, consider a function $f$ with
$|\mathcal{F}f(u)| \asymp (1 + |u|)^{-s-1}$ (e.g. $f(x) = e^{-|x|}$ with $s = 1$), suppose $\sup_u (1 + |u|)^r\,|\mathcal{F}g_\sigma(u)| < \infty$
for $r > 0$ and assume polynomial decay of order $\beta \ge s$ of the characteristic function. Then
$\int (1 + |u|)^\beta\,|\mathcal{F}f(u)\,\mathcal{F}K_h(u)|\,du \asymp h^{s-\beta}$ holds such that $c\,h^{\beta-s} f * K_h$ lies in $F_\beta$, $c > 0$ some
small constant, and Theorem 4.2 implies that

$$\Bigl|\int f\,(\hat g_{\sigma,n,h} - K_h * g_\sigma)\Bigr| = O_P\bigl(h^{s-\beta}\,n^{-1/2}\,(\log n)^{1/2+\delta}\bigr).$$

Together with an easy bias estimate of order $h^{r+s}$ this yields for the estimation error
$|\int f\,(\hat g_{\sigma,n,h} - g_\sigma)|$ up to logarithmic factors the rate $n^{-(r+s)/(2r+2\beta)}$, provided the
bandwidth is chosen in an optimal way. We conclude that our results also allow to obtain risk
bounds under smoothness restrictions, which are coherent with the abstract results in
Goldenshluger and Pereverzev (2003). The rates should also be compared with the case of
continuous-time observations on $[0,T]$, where Figueroa-Lopez and Houdre (2006) obtained
the classical nonparametric rate $T^{-r/(2r+1)}$ for estimating $g_\sigma$ on a bounded interval.

5. Implementation 

Although the main focus of our work is theoretical, we point out how the minimum
distance estimator can be implemented and show a numerical example. The main
computational problem is that the procedure requires minimizing a nonlinear functional over
the space of all finite measures. One possibility is to use a global optimisation procedure,
e.g. based on simulated annealing, cf. Hall and Yao (2003) for an application to
minimum-distance fits based on characteristic functions. Here we shall look for a good preliminary
estimator and minimize the $d^{(2)}$-criterion locally around this pilot estimator, which turns
out to be more stable in simulations than global optimisation routines.

We use the identification formula (4.1) to build a first-stage plug-in estimator $(\tilde b_n,\tilde\nu_{\sigma,n})$.
While the mean $b$ will be easily estimated by

$$\tilde b_n := \frac{1}{n}\sum_{t=1}^{n}(X_t - X_{t-1}) = X_n/n,$$

we have to be more careful with an estimator of $\nu_\sigma$. Since $\mathcal{F}\nu_\sigma(u) = (\varphi'(u)/\varphi(u))^2 - \varphi''(u)/\varphi(u)$
by (4.1), one might be tempted to estimate its Fourier transform just by plugging in
the empirical characteristic function $\hat\varphi_n$ for $\varphi$. It turns out, however, that the occurrence
of $\hat\varphi_n(u)$ in the denominator might have unfavorable effects, particularly if $|\varphi(u)|$ is small.
To get some intuition for a possible remedy, consider the problem of estimating $1/\varphi(u)$.
The reciprocal $1/\hat\varphi_n(u)$ is certainly a good estimator as long as $|\varphi(u)|$ is not too small. On the other
hand, since the noise level of $\hat\varphi_n(u)$ is $O(n^{-1/2})$ we should no longer rely on $\hat\varphi_n(u)$ if
$\hat\varphi_n(u) = O(n^{-1/2})$. To take this into account, one can use $\mathbb{1}_{\{|\hat\varphi_n(u)| \ge \kappa n^{-1/2}\}}/\hat\varphi_n(u)$ as an
estimator for $1/\varphi(u)$ which can be proven to satisfy
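
$$E_{b,\nu_\sigma}\Bigl|\frac{\mathbb{1}_{\{|\hat\varphi_n(u)| \ge \kappa n^{-1/2}\}}}{\hat\varphi_n(u)} - \frac{1}{\varphi(u)}\Bigr|^p = O\Bigl(\Bigl(\frac{n^{-1/2}}{|\varphi(u)|^2} \wedge \frac{1}{|\varphi(u)|}\Bigr)^p\Bigr)$$

for any positive threshold value $\kappa$ and all $p \in \mathbb{N}$. This is what we can at best expect from
an estimator of $1/\varphi(u)$. Using this idea we define our preliminary estimator of $\mathcal{F}\nu_\sigma(u)$ by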

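$$\mathcal{F}\tilde\nu_{\sigma,n}(u) := -\Bigl(\frac{\hat\varphi_n''(u)}{\hat\varphi_n(u)} - \frac{\hat\varphi_n'(u)^2}{\hat\varphi_n(u)^2}\Bigr)\,\mathbb{1}_{\{|\hat\varphi_n(u)| \ge \kappa n^{-1/2}\}}, \tag{5.1}$$

where $\kappa$ is a positive constant. In Section 6.4 below we shall prove the following result.

Proposition 5.1. We have $E_{b,\nu_\sigma}(\tilde b_n - b)^2 = O(n^{-1})$ and for $u \in \mathbb{R}$

$$E_{b,\nu_\sigma}\bigl|\mathcal{F}\tilde\nu_{\sigma,n}(u) - \mathcal{F}\nu_\sigma(u)\bigr| = O\Bigl(\Bigl(\frac{n^{-1/2}}{|\varphi(u)|} \wedge 1\Bigr)\bigl(1 + |\Psi'(u)|^2\bigr)\Bigr).$$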



This will give pointwise rates of convergence in a similar fashion as before and serves
well as a starting point of a local optimisation routine. Note that this pilot estimator is
very easy and fast to implement. Yet, it has certain drawbacks, most importantly $\mathcal{F}\tilde\nu_{\sigma,n}$
is usually not positive semidefinite so that $\tilde\nu_{\sigma,n}$ is not necessarily a non-negative measure.
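
The pilot step can be sketched in a few lines of Python. The code assumes that the pilot estimator takes the thresholded plug-in form suggested by (4.1), namely $(\hat\varphi_n'/\hat\varphi_n)^2 - \hat\varphi_n''/\hat\varphi_n$ on the set where $|\hat\varphi_n(u)| \ge \kappa n^{-1/2}$ and zero elsewhere; the function name, the default $\kappa = 1$ and the simulated data are choices made for this example.

```python
import numpy as np

def pilot_fourier_nu(increments, u, kappa=1.0):
    """Thresholded plug-in estimate of F nu_sigma(u) from the empirical characteristic function."""
    y = np.asarray(increments, dtype=float)
    u = np.atleast_1d(np.asarray(u, dtype=float))
    n = y.size
    e = np.exp(1j * np.outer(u, y))
    phi0 = e.mean(axis=1)                      # empirical characteristic function
    phi1 = (1j * y * e).mean(axis=1)           # its first derivative
    phi2 = (-(y ** 2) * e).mean(axis=1)        # its second derivative
    keep = np.abs(phi0) >= kappa / np.sqrt(n)  # trust 1/phi0 only above the noise level
    safe = np.where(keep, phi0, 1.0)           # avoid division by tiny values
    estimate = (phi1 / safe) ** 2 - phi2 / safe
    return np.where(keep, estimate, 0.0)

rng = np.random.default_rng(4)
y = rng.normal(0.0, 1.0, 1000) + rng.exponential(1.0, 1000)   # hypothetical increments
variance_estimate = pilot_fourier_nu(y, 0.0)[0]
print(variance_estimate.real, np.var(y))
```

At $u = 0$ the estimate reduces to the empirical variance of the increments, in agreement with the identity $\mathrm{Var}(X_1) = \nu_\sigma(\mathbb{R})$; an inverse Fourier transform of the estimate over a frequency grid would then give a signed pilot measure.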

In practice, our two-stage procedure works reasonably well. For a numerical example
we simulate a Levy process $(X_t)_{t\ge 0}$ with $\sigma = 1$, $b = 1$ and $\nu(dx) = x^{-1}e^{-x}\,\mathbb{1}_{\{x>0\}}\,dx$.
The process $X$ is a superposition of an infinite-intensity Gamma process and a standard
Brownian motion. The law of its increments $X_t - X_{t-1}$ is the convolution of an $N(0,1)$- and
an Exp(1)-distribution. We have $n = 1000$ observations, see Figure 1 (left) for a histogram
of the increments. The sample is rather dispersed with some increments close to 10 and a
sample mean of $\tilde b_n = 0.936$ (true $b = 1$). The true characteristic function has Gaussian
decay and its absolute value is shown together with that of the empirical characteristic
function in Figure 1 (right).

Figure 1. Left: Histogram of the data. Right: modulus of the empirical
(solid blue) and true (dashed orange) characteristic function.
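
A data set of the same type can be generated directly from the stated law of the increments, the convolution of an $N(0,1)$- and an Exp(1)-distribution, whose characteristic function is $e^{-u^2/2}/(1 - iu)$. The following sketch is not the simulation behind the figures; the seed, the frequency grid and the variable names are assumptions, so the numbers it produces will differ from those reported in the text.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1000
# Increments X_t - X_{t-1}: standard normal part plus Gamma(1,1) = Exp(1) part.
increments = rng.normal(0.0, 1.0, n) + rng.exponential(1.0, n)

u = np.linspace(-6.0, 6.0, 241)
phi_hat = np.exp(1j * np.outer(u, increments)).mean(axis=1)   # empirical characteristic function
phi_true = np.exp(-0.5 * u ** 2) / (1.0 - 1j * u)             # N(0,1) convolved with Exp(1)

sample_mean = increments.mean()
max_dev = np.max(np.abs(phi_hat - phi_true))
print(sample_mean, max_dev)
```

The maximal deviation is of the order $n^{-1/2}$, which is already comparable to $|\varphi(u)|$ for moderately large $|u|$ because of the Gaussian decay.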

We discretize the pilot estimate $\tilde\nu_{\sigma,n}$ of the jump measure by using a Haar wavelet basis
on the interval $[-10,10]$ with 15 basis functions. Moreover, we allow for a point measure in
zero to have a better resolution there. Its pilot mass is set to zero. Using the FindMinimum
local optimisation procedure in Mathematica, we minimize the $d^{(2)}$-criterion locally around
the discretized pilot estimator, constraining to non-negative Levy measures. In Figure
2 (left) we display for the given data the imaginary part of the empirical characteristic
function together with the imaginary parts of the other characteristic functions of interest
(true, pilot, final estimator). The errors in fitting the real part are less pronounced because
there are fewer oscillations around zero (note $(\mathrm{Re}\,\varphi)'(0) = 0$). Typically, the pilot estimator
gives already a reasonably good fit and the final estimator has a characteristic function
which is closer to the empirical characteristic function than the true one.

Figure 2 (right) finally shows the densities of the rescaled Levy measures $\nu_\sigma$, but
suppresses the point masses in zero. Note that the original Levy density and also its plug-in
estimators have a singularity at zero because of $\nu(dx) = x^{-2}\,\nu_\sigma(dx)$ for $x \ne 0$. The
parameters are estimated as $\hat b_n = 0.922$ (true $b = 1$) and $\hat\nu_{\sigma,n}(\{0\})^{1/2} = 1.092$ (true $\sigma = 1$).
The pilot estimator has no point mass in zero and its density is therefore large around
zero. It is seen that the final estimator improves upon the pilot estimator, in particular by
excluding negative values and catching the point mass in zero. Given 1000 observations
and a Gaussian deconvolution problem, the estimation problem is quite hard. The rough,
step-wise form of the final estimator is not so pleasant for the human eye, but we only
want to use this estimator as an integrator of smooth functions and, as discussed above, we
could apply a kernel to obtain a smooth density function. As an example for a functional
to be estimated, we calculated $\int_1^\infty x^{-2}\,\hat\nu_{\sigma,n}(dx)$ which estimates $\nu([1,\infty))$, the intensity
of jumps larger than one. In this sample, the true value 0.22 was estimated by 0.16. Let
us remark that the high-frequency estimator, using the relative frequency of increments
$X_t - X_{t-1}$ that are larger than one, yields the estimate 0.46. The large error of the latter
confirms a strong violation of the underlying high-frequency assumption that between two
observations very rarely more than one larger jump occurs and that the diffusion part is
negligible. Hence, the frequency of the observations must indeed be considered as low for
the construction of the estimator.

Figure 2. Left: Imaginary part of the empirical (dot-dashed blue), true
(orange dashed), pilot (dotted green) and final estimated (red solid)
characteristic function. Right: pilot (dotted green), final (red solid) estimator
and true (dashed orange) density of $\nu_\sigma$; the pilot estimator does not have
a point mass in zero.
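
The two population quantities behind this comparison can be computed directly. The sketch below evaluates $\nu([1,\infty)) = \int_1^\infty x^{-1}e^{-x}\,dx$ by a trapezoidal rule (truncating the integral at 60, an assumption that costs a negligible tail) and the probability that an increment, distributed as the sum of independent $N(0,1)$ and Exp(1) variables, exceeds one, which equals $P(Z > 1) + \tfrac12 e^{-1/2}$ for standard normal $Z$.

```python
import math
import numpy as np

# nu([1, infinity)) for the Levy density x^{-1} e^{-x} on (0, infinity).
x = np.linspace(1.0, 60.0, 590001)
density = np.exp(-x) / x
jump_intensity = float(np.sum(0.5 * (density[1:] + density[:-1]) * np.diff(x)))

# P(Z + E > 1) for independent Z ~ N(0,1) and E ~ Exp(1):
# P(Z > 1) + E[exp(-(1 - Z)); Z <= 1] = P(Z > 1) + exp(-1/2) / 2.
p_increment_gt_1 = 0.5 * math.erfc(1.0 / math.sqrt(2.0)) + 0.5 * math.exp(-0.5)

print(round(jump_intensity, 4), round(p_increment_gt_1, 4))
```

The first value agrees with the true value 0.22 quoted above, while the second is close to the high-frequency estimate 0.46, so that estimator targets a quantity that is more than twice as large as the jump intensity in this low-frequency regime.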



6. Proofs 



6.1. Proof of Theorem 4.1, We begin the proof with a few definitions. Given two 
functions /,n : M — > M the bracket [/,n] denotes the set of functions / with I ^ f ^ u. 
For a set G of functions the L^-bracketing number N^^{e,G) is the minimum number of 
brackets [li,Ui\, satisfying E[(uj(Zi) — Zj(Zi))^] ^ e^, that are needed to cover G. The 
associated bracketing integral is defined as 

J[](<5,G) = ^log{N[]{e,G))d8. 

Furthermore, a function / is called envelope function for G, if |/| ^ / holds for all f £ G. 
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To apply Corollary 19.35 from van der Vaart (1998), we decompose Cn in its real and 
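imaginary parts,

$$\mathrm{Re}\big(C_n^{(k)}(u)\big) = n^{-1/2} \sum_{t=1}^{n} \frac{\partial^k}{\partial u^k}\big(\cos(uZ_t) - E\cos(uZ_1)\big),$$

$$\mathrm{Im}\big(C_n^{(k)}(u)\big) = n^{-1/2} \sum_{t=1}^{n} \frac{\partial^k}{\partial u^k}\big(\sin(uZ_t) - E\sin(uZ_1)\big).$$

Accordingly, we consider the following class of functions: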



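$$G_k = \Big\{ z \mapsto w(u)\frac{\partial^k}{\partial u^k}\cos(uz) \;\Big|\; u \in \mathbb{R} \Big\} \cup \Big\{ z \mapsto w(u)\frac{\partial^k}{\partial u^k}\sin(uz) \;\Big|\; u \in \mathbb{R} \Big\}.$$

An envelope function $F_k$ for $G_k$ is given by $F_k = |x|^k$. Now we obtain from Corollary 19.35 in van der Vaart (1998) that

$$E\|C_n^{(k)}\|_{L^\infty(w)} \le C\Big\{ \sqrt{E(F_k(Z_1))^2} + J_{[\,]}\big(\sqrt{E Z_1^{2k}},\, G_k\big) \Big\}. \tag{6.1}$$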

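Since $E Z_1^{2k} < \infty$ it remains to bound the bracketing integral on the right-hand side of (6.1). Inspired by Yukich (1985), we proceed by setting, for every $\varepsilon > 0$,

$$M := M(\varepsilon, k) := \inf\big\{ m > 0 \;\big|\; E[Z_1^{2k} 1_{\{|Z_1| > m\}}] \le \varepsilon^2 \big\}.$$

Furthermore, we set, for grid points $u_j \in \mathbb{R}$ to be specified below,

$$g_j^{\pm}(z) = \Big( w(u_j)\frac{\partial^k}{\partial u_j^k}\cos(u_j z) \pm \varepsilon |z|^k \Big) 1_{[-M,M]}(z) \pm \|w\|_\infty |z|^k 1_{[-M,M]^c}(z),$$

$$h_j^{\pm}(z) = \Big( w(u_j)\frac{\partial^k}{\partial u_j^k}\sin(u_j z) \pm \varepsilon |z|^k \Big) 1_{[-M,M]}(z) \pm \|w\|_\infty |z|^k 1_{[-M,M]^c}(z).$$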
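We obtain for the width of the brackets that

$$E\big[(g_j^{+}(Z_1) - g_j^{-}(Z_1))^2\big] \le E\big[4\varepsilon^2 Z_1^{2k} 1_{[-M,M]}(Z_1) + 4\|w\|_\infty^2 Z_1^{2k} 1_{[-M,M]^c}(Z_1)\big] \le 4\varepsilon^2\big(E Z_1^{2k} + \|w\|_\infty^2\big)$$

and, analogously,

$$E\big[(h_j^{+}(Z_1) - h_j^{-}(Z_1))^2\big] \le 4\varepsilon^2\big(E Z_1^{2k} + \|w\|_\infty^2\big).$$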



It remains to choose the grid points $u_j$ in such a way that the brackets cover the set $G_k$. We consider an arbitrary $u \in \mathbb{R}$ and any grid point $u_j$. Then with the Lipschitz constant $\mathrm{Lip}(w)$ of the weight function $w$

$$\Big| w(u)\frac{\partial^k}{\partial u^k}\cos(uz) - w(u_j)\frac{\partial^k}{\partial u_j^k}\cos(u_j z) \Big| \le |z|^k \min\big\{ |u - u_j|\,(\mathrm{Lip}(w) + \|w\|_\infty |z|),\; w(u) + w(u_j) \big\}.$$

Therefore, the function $z \mapsto w(u)\frac{\partial^k}{\partial u^k}\cos(uz)$ is contained in the bracket $[g_j^{-}, g_j^{+}]$ if

$$\min\big\{ |u - u_j|\,(\mathrm{Lip}(w) + \|w\|_\infty M),\; w(u) + w(u_j) \big\} \le \varepsilon.$$

Consequently, we choose the grid points as

$$u_j = j\varepsilon / \big(\mathrm{Lip}(w) + \|w\|_\infty M(\varepsilon, k)\big),$$

for $|j| \le J(\varepsilon)$, where $J(\varepsilon)$ is the smallest integer such that $u_{J(\varepsilon)}$ is greater than or equal
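to

$$U(\varepsilon) = \inf\Big\{ u > 0 \;\Big|\; \sup_{v : |v| \ge u} w(v) \le \varepsilon/2 \Big\}.$$

This yields the estimate $N_{[\,]}(\varepsilon, G_k) \le 2(2J(\varepsilon) + 1)$. It follows from the generalized Markov inequality that

$$M(\varepsilon, k) \le \big( E[|Z_1|^{2k+\gamma}] / \varepsilon^2 \big)^{1/\gamma}.$$

Now we obtain from the inequality

$$J(\varepsilon) \le 2U(\varepsilon)\big(\mathrm{Lip}(w) + \|w\|_\infty M(\varepsilon, k)\big)/\varepsilon + 1$$

that $\log(N_{[\,]}(\varepsilon, G_k)) = O(\log(J(\varepsilon))) = O\big(\varepsilon^{-(\delta + 1/2)^{-1}} + \log(\varepsilon^{-1-2/\gamma})\big) = O(\varepsilon^{-\eta})$ for $\eta = (\delta + 1/2)^{-1} < 2$. This implies

$$\int_0^{\sqrt{E Z_1^{2k}}} \sqrt{\log N_{[\,]}(\varepsilon, G_k)}\, d\varepsilon < \infty,$$

as required. □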
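As an informal numerical illustration of the statement just proved (case $k = 0$), one can evaluate the weighted supremum of $C_n(u) = n^{-1/2}\sum_t (e^{iuZ_t} - E e^{iuZ_1})$ on a grid. The following is a minimal sketch, not part of the proof: the standard normal sample, the grid on $[-50, 50]$, the choice $\delta = 0.5$ and all function names are assumptions made for this example, and a finite grid only approximates the supremum over the whole real line.

```python
import cmath
import math
import random

def weight(u, delta=0.5):
    """Weight function w(u) = (log(e + |u|))^(-1/2 - delta)."""
    return math.log(math.e + abs(u)) ** (-0.5 - delta)

def weighted_sup_norm(sample, grid, delta=0.5):
    """Grid approximation of sup_u w(u) |C_n(u)| for N(0,1) data, where
    C_n(u) = n^(-1/2) * sum_t (exp(i u Z_t) - E exp(i u Z_1))."""
    n = len(sample)
    best = 0.0
    for u in grid:
        true_cf = math.exp(-0.5 * u * u)  # characteristic function of N(0,1)
        empirical_sum = sum(cmath.exp(1j * u * z) for z in sample)
        c_n = (empirical_sum - n * true_cf) / math.sqrt(n)
        best = max(best, weight(u, delta) * abs(c_n))
    return best

rng = random.Random(1)
grid = [0.25 * j for j in range(-200, 201)]  # u in [-50, 50]
results = {}
for n in (100, 400):
    sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
    results[n] = weighted_sup_norm(sample, grid)
    print(n, round(results[n], 3))
```

The weighted supremum stays of moderate size for both sample sizes, in line with a bound on $E\|C_n\|_{L^\infty(w)}$ that does not depend on $n$.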






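6.2. Proof of Theorem 4.2. To simplify the notation, we use the abbreviations $\hat\Psi_n(u) = \Psi(u; \hat b_n, \hat\nu_{\sigma,n})$ and $\hat\varphi_n(u) = \exp(\hat\Psi_n(u))$.

First of all, we obtain from the triangle inequality that

$$d^{(2)}(\hat\varphi_n, \varphi) \le 2\, d^{(2)}(\varphi_n, \varphi) + \delta_n. \tag{6.2}$$

Proof for $\hat b_n$

We have that $\varphi'(0) = ib$ and $\hat\varphi_n'(0) = i\hat b_n$. Therefore, we obtain from (6.2) and Theorem 4.1 that

$$E_{b,\nu_\sigma}|\hat b_n - b| = E_{b,\nu_\sigma}|\hat\varphi_n'(0) - \varphi'(0)| \le 2 E_{b,\nu_\sigma}\, d^{(2)}(\varphi_n, \varphi) + \delta_n = O(n^{-1/2}).$$

Proof for $\hat\nu_{\sigma,n}$

We consider the following set of "unfavorable" events:

$$A_n := \{\hat\nu_{\sigma,n}(\mathbb{R}) > \nu_\sigma(\mathbb{R}) + 1\} \cup \{|\hat b_n| > |b| + 1\}.$$

From $\varphi'(0)^2 - \varphi''(0) = \nu_\sigma(\mathbb{R})$ and the analogous formula for $\hat\varphi_n$ it follows that

$$|\hat\nu_{\sigma,n}(\mathbb{R}) - \nu_\sigma(\mathbb{R})| = \big|(\hat\varphi_n'(0)^2 - \varphi'(0)^2) - (\hat\varphi_n''(0) - \varphi''(0))\big| \le \big(2|\varphi'(0)| + d^{(2)}(\varphi, \hat\varphi_n) + 1\big)\, d^{(2)}(\varphi, \hat\varphi_n). \tag{6.3}$$

Consequently, the (generalized) Markov inequality yields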

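$$\begin{aligned}
P_{b,\nu_\sigma}(A_n) &\le E_{b,\nu_\sigma}\big[|\hat\nu_{\sigma,n}(\mathbb{R}) - \nu_\sigma(\mathbb{R})| \wedge 1\big] + E_{b,\nu_\sigma}\big[|\hat b_n - b|\big] \\
&\le E_{b,\nu_\sigma}\big[\big((2|b| + d^{(2)}(\varphi, \hat\varphi_n) + 1)\, d^{(2)}(\varphi, \hat\varphi_n)\big) \wedge 1\big] + E_{b,\nu_\sigma}\big[d^{(2)}(\varphi, \hat\varphi_n)\big] \\
&\le E_{b,\nu_\sigma}\big[(4|b| + 2)\, d^{(2)}(\varphi, \hat\varphi_n)\big] + E_{b,\nu_\sigma}\big[d^{(2)}(\varphi, \hat\varphi_n)\big] \\
&\le (4|b| + 3)\big(2E_{b,\nu_\sigma}[d^{(2)}(\varphi_n, \varphi)] + \delta_n\big) = O(n^{-1/2}),
\end{aligned}$$

which implies that

$$\begin{aligned}
E_{b,\nu_\sigma}\Big[\sup_{f \in F_s}\Big|\int f\, d\hat\nu_{\sigma,n} - \int f\, d\nu_\sigma\Big|\, 1_{A_n}\Big]
&\le \sup_{f \in F_s}\|f\|_\infty\, E_{b,\nu_\sigma}\big[(\hat\nu_{\sigma,n}(\mathbb{R}) + \nu_\sigma(\mathbb{R}))\, 1_{A_n}\big] \\
&\le \sup_{f \in F_s}\|f\|_\infty\, E_{b,\nu_\sigma}\big[\big(2\nu_\sigma(\mathbb{R}) + (2|\varphi'(0)| + d^{(2)}(\varphi, \hat\varphi_n) + 1)\, d^{(2)}(\varphi, \hat\varphi_n)\big)\, 1_{A_n}\big].
\end{aligned} \tag{6.4}$$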



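It remains to analyse the loss under $A_n^c$. It follows from Parseval's identity that

$$\Big|\int f\, d\hat\nu_{\sigma,n} - \int f\, d\nu_\sigma\Big| = \frac{1}{2\pi}\Big|\int \mathcal{F}f(u)\, \overline{\big(\mathcal{F}\hat\nu_{\sigma,n}(u) - \mathcal{F}\nu_\sigma(u)\big)}\, du\Big| \le \frac{1}{2\pi}\int |\mathcal{F}f(u)|\, \big|\hat\Psi_n''(u) - \Psi''(u)\big|\, du. \tag{6.5}$$

The differences occurring in the integrand on the right-hand side of (6.5) can be estimated using $\Psi' = \varphi'/\varphi$ and $\hat\Psi_n' = \hat\varphi_n'/\hat\varphi_n$:

$$|\hat\Psi_n'(u) - \Psi'(u)| \le \frac{|\hat\varphi_n'(u) - \varphi'(u)| + |\hat\Psi_n'(u)|\,|\hat\varphi_n(u) - \varphi(u)|}{|\varphi(u)|} \tag{6.6}$$

and

$$|\hat\Psi_n''(u) - \Psi''(u)| \le \frac{|\hat\varphi_n''(u) - \varphi''(u)|}{|\varphi(u)|} + \big(|\hat\Psi_n''(u)| + |\hat\Psi_n'(u)|^2\big)\frac{|\hat\varphi_n(u) - \varphi(u)|}{|\varphi(u)|} + \big(|\Psi'(u)| + |\hat\Psi_n'(u)|\big)\,|\hat\Psi_n'(u) - \Psi'(u)|. \tag{6.7}$$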

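Note that the following estimates hold true under $A_n^c$:

$$|\hat\Psi_n'(u)| \le |\hat b_n| + |u|\,\hat\nu_{\sigma,n}(\mathbb{R}) \le |b| + 1 + |u|\,(\nu_\sigma(\mathbb{R}) + 1), \tag{6.8}$$

$$|\hat\Psi_n''(u)| \le |\mathcal{F}\hat\nu_{\sigma,n}(u)| \le \hat\nu_{\sigma,n}(\mathbb{R}) \le \nu_\sigma(\mathbb{R}) + 1. \tag{6.9}$$

Hence, we obtain from (6.5) to (6.9) and the trivial estimate $|\mathcal{F}\hat\nu_{\sigma,n}(u) - \mathcal{F}\nu_\sigma(u)| \le$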



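$\hat\nu_{\sigma,n}(\mathbb{R}) + \nu_\sigma(\mathbb{R})$ that under $A_n^c$, with some constant $C > 0$,

$$\begin{aligned}
\Big|\int f\, d\hat\nu_{\sigma,n} - \int f\, d\nu_\sigma\Big|
&\le C \int_{-\infty}^{\infty} |\mathcal{F}f(u)| \Big( \frac{(1+|u|)^2\, d^{(2)}(\hat\varphi_n, \varphi)}{w(u)|\varphi(u)|} \wedge 1 \Big)\, du \\
&\le C \int_{-\infty}^{\infty} (1+|u|)^s |\mathcal{F}f(u)|\, du\; \sup_{u \in \mathbb{R}} \Big\{ (1+|u|)^{-s} \Big( \frac{(1+|u|)^2\, d^{(2)}(\hat\varphi_n, \varphi)}{w(u)|\varphi(u)|} \wedge 1 \Big) \Big\} \\
&\le C \sup_{u \ge 0} \Big\{ (1+u)^{-s} \Big( \frac{(1+u)^2 n^{-1/2}}{w(u)|\varphi(u)|} \wedge 1 \Big) \Big\} \big( n^{1/2} d^{(2)}(\hat\varphi_n, \varphi) + 1 \big).
\end{aligned}$$

By monotonicity of $(1+u)^{-s}$ we can replace the supremum over $[0, \infty)$ by the supremum over $[0, U_n]$ and we arrive at

$$E_{b,\nu_\sigma}\Big[ \sup_{f \in F_s} \Big|\int f\, d\hat\nu_{\sigma,n} - \int f\, d\nu_\sigma\Big|\, 1_{A_n^c} \Big] = O\Big( \sup_{u \in [0, U_n]} \Big\{ (1+u)^{-s} \Big( \frac{(1+u)^2 n^{-1/2}}{w(u)|\varphi(u)|} \wedge 1 \Big) \Big\} \Big). \tag{6.10}$$

Together with the bound (6.4) on the set $A_n$ this yields the asserted general estimate.

Tracing back the constants, we see that they depend continuously on $|b|$ and $\nu_\sigma(\mathbb{R})$.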
Proof of the rate results (a), (b)

(a) Under the condition $\log|\varphi(u)| = -\sigma^2 u^2/2\,(1 + o(1))$ we have $U_n \asymp \sqrt{\log n}$ and we obtain the rate $U_n^{-s} = (\log n)^{-s/2}$.

(b) If $|\varphi(u)| \ge C e^{-\alpha|u|}$, then we have $U_n \asymp \log n$ and we obtain the rate $U_n^{-s} = (\log n)^{-s}$.

Proof of the rate result (c)

The same reasoning as for cases (a) and (b) would only yield the rate $\big((\log n)^{-1/2-\delta} n^{1/2}\big)^{-s/(\beta+2)}$ for $s \in (0, \beta + 2]$ and the parametric rate for $s > \beta + 2$. In the polynomial case (c), though, better estimates for $|\hat\Psi_n'(u)|$ hold, i.e. we can improve upon (6.8). First, we formulate and prove a lemma for $|\Psi'(u)|$.



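Lemma 6.1. If a Levy process with a finite first moment has a characteristic function (at time $t = 1$) satisfying $|\varphi(u)| \ge C(1 + |u|)^{-\beta}$ for some $\beta \ge 0$, $C > 0$ and all $u \in \mathbb{R}$, then $\int_{[-1,1]} |x|^{\alpha}\, \nu(dx)$ is finite for all $\alpha > 0$ and the derivative of its characteristic exponent is uniformly bounded:

$$\sup_{u \in \mathbb{R}} |\Psi'(u)| < \infty.$$

Proof of Lemma 6.1. Since we have necessarily $\sigma^2 = 0$ in the Levy-Khinchine characteristic as well as $\int_{[-1,1]^c} |x|\, \nu(dx) < \infty$ from the first moment condition, the additional property $\int_{[-1,1]} |x|\, \nu(dx) < \infty$ implies

$$\sup_{u \in \mathbb{R}} |\Psi'(u)| = \sup_{u \in \mathbb{R}} \Big| ib + i\int (e^{iux} - 1)\, x\, \nu(dx) \Big| \le |b| + 2\int |x|\, \nu(dx) < \infty.$$

It therefore remains to prove the first result for any $\alpha > 0$. We obtain with $c := \min_{u \in [1,2]} (1 - \cos(u)) > 0$:

$$\begin{aligned}
\int_{[-1,1]} |x|^{\alpha}\, \nu(dx)
&\le \sum_{n=1}^{\infty} \int_{\{x :\, 2^{-n} \le |x| \le 2^{-n+1}\}} |x|^{\alpha}\, \nu(dx) \\
&\le \sum_{n=1}^{\infty} 2^{-\alpha(n-1)} \int_{\{x :\, 2^{-n} \le |x| \le 2^{-n+1}\}} c^{-1}\big(1 - \cos(2^n x)\big)\, \nu(dx) \\
&\le c^{-1} \sum_{n=1}^{\infty} 2^{-\alpha(n-1)}\, \mathrm{Re}\big(-\Psi(2^n)\big) \\
&\le c^{-1} \sum_{n=1}^{\infty} 2^{-\alpha(n-1)} \big( \log(C^{-1}) + \beta \log(1 + 2^n) \big).
\end{aligned}$$

This latter series is obviously finite. □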
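The two assertions of Lemma 6.1 can be checked numerically on a simple example. The sketch below uses a Gamma process as a hypothetical test case, which is an assumption of this illustration and not an example from the text: with shape 2 and rate 1 it has $\varphi(u) = (1 - iu)^{-2}$, hence $|\varphi(u)| = (1+u^2)^{-1} \ge (1+|u|)^{-2}$ (so $C = 1$, $\beta = 2$), and Levy measure $\nu(dx) = 2x^{-1}e^{-x}\,dx$ on $(0,\infty)$. The code compares $\int_0^1 x^{\alpha}\nu(dx)$ with the series bound from the proof and evaluates $|\Psi'(u)|$ on a grid.

```python
import cmath
import math

BETA, GAMMA = 2.0, 1.0  # hypothetical shape and rate of a Gamma process

def char_exponent(u):
    """Psi(u) = log phi(u) for phi(u) = (1 - i u / GAMMA)^(-BETA)."""
    return -BETA * cmath.log(1 - 1j * u / GAMMA)

def char_exponent_derivative(u, h=1e-5):
    """Central finite-difference approximation of Psi'(u)."""
    return (char_exponent(u + h) - char_exponent(u - h)) / (2 * h)

def small_jump_moment(alpha, steps=20000):
    """Midpoint rule for int_0^1 x^alpha nu(dx), nu(dx) = BETA x^-1 e^(-GAMMA x) dx,
    after the substitution x = t^(1/alpha), which removes the singularity at zero."""
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) / steps
        total += math.exp(-GAMMA * t ** (1.0 / alpha))
    return BETA * total / (alpha * steps)

def dyadic_bound(alpha, terms=200):
    """Series bound from the proof with C = 1 (valid here because GAMMA = 1)."""
    c = 1.0 - math.cos(1.0)  # minimum of 1 - cos(u) on [1, 2]
    series = sum(2.0 ** (-alpha * (n - 1)) * BETA * math.log(1.0 + 2.0 ** n)
                 for n in range(1, terms + 1))
    return series / c

sup_derivative = max(abs(char_exponent_derivative(0.1 * j)) for j in range(-1000, 1001))
moment, bound = small_jump_moment(0.5), dyadic_bound(0.5)
print(sup_derivative, moment, bound)
```

For this example $\Psi'(u) = i\beta/(\gamma - iu)$, so the supremum of $|\Psi'|$ is attained at $u = 0$ and equals $\beta/\gamma$; the series bound is far from sharp but finite, which is all the lemma needs.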



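Resuming the proof for case (c), we remark that $|\varphi(u)| \ge C(1 + |u|)^{-\beta}$ implies for any $U > 0$

$$\begin{aligned}
P_{b,\nu_\sigma}\Big( \exists u \in [-U, U] : |\hat\varphi_n(u)| < \tfrac{C}{2}(1 + |u|)^{-\beta} \Big)
&\le P_{b,\nu_\sigma}\Big( \sup_{u \in [-U,U]} |\hat\varphi_n(u) - \varphi(u)|\,(1 + |u|)^{\beta} \ge C/2 \Big) \\
&\le \tfrac{2}{C}\, E\big[\|\hat\varphi_n - \varphi\|_{L^\infty(w)}\big]\, w(U)^{-1}(1 + U)^{\beta} = O\big(n^{-1/2} w(U)^{-1}(1 + U)^{\beta}\big).
\end{aligned}$$

Consequently, for $U_n \to \infty$ with $w(U_n)^{-1} U_n^{\beta} = o(n^{1/2})$ we have

$$\lim_{n \to \infty} P_{b,\nu_\sigma}\Big( \forall u \in [-U_n, U_n] : |\hat\varphi_n(u)| \ge \tfrac{C}{2}(1 + |u|)^{-\beta} \Big) = 1; \tag{6.11}$$

in the sequel we shall work with $U_n = n^{1/(2\beta)} (\log n)^{-(1/2 + 2\delta)/\beta}$. Theorem 4.1, Lemma 6.1 and Equation (6.11) then yield

$$\sup_{|u| \le U_n} |\hat\Psi_n'(u) - \Psi'(u)| \le \sup_{|u| \le U_n} \Big( \frac{|\hat\varphi_n'(u) - \varphi'(u)|}{|\hat\varphi_n(u)|} + \frac{|\Psi'(u)|\,|\hat\varphi_n(u) - \varphi(u)|}{|\hat\varphi_n(u)|} \Big) = O_P(n^{-1/2})\, w(U_n)^{-1} (1 + |U_n|)^{\beta}. \tag{6.12}$$

Together with Estimate (6.12) and again Lemma 6.1 we have thus established for $n \to \infty$

$$\sup_{|u| \le U_n} |\hat\Psi_n'(u)| = O_P\big(1 + n^{-1/2} w(U_n)^{-1} |U_n|^{\beta}\big) = O_P(1). \tag{6.13}$$

We therefore get instead of (6.10) the estimate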



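$$\begin{aligned}
\sup_{f \in F_s} \Big|\int f\, d\hat\nu_{\sigma,n} - \int f\, d\nu_\sigma\Big|
&\le \sup_{u \in [0, U_n]} \Big\{ (1 + |u|)^{-s} \Big( \frac{n^{-1/2}}{w(u)|\varphi(u)|} \wedge 1 \Big) \Big\}\, O_P(1) \\
&\le \sup_{u \in [0, U_n]} \Big\{ (1 + |u|)^{-s} \Big( \frac{n^{-1/2}}{C (\log(e + |u|))^{-1/2-\delta} (1 + |u|)^{-\beta}} \wedge 1 \Big) \Big\}\, O_P(1).
\end{aligned}$$

For $s \le \beta$ the right-hand side is of order $O_P(U_n^{-s})$ and we obtain

$$\sup_{f \in F_s} \Big|\int f\, d\hat\nu_{\sigma,n} - \int f\, d\nu_\sigma\Big| = O_P\big( n^{-s/(2\beta)} (\log n)^{s(1/2 + 2\delta)/\beta} \big),$$

while for $s > \beta$ the parametric rate $O_P(n^{-1/2})$ follows. □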



6.3. Proof of Theorem 4.4. The lower bound will be established by looking at a decision problem between two local alternatives, see e.g. Korostelev and Tsybakov (1993) for the general idea. For $\gamma > 0$ and $\beta > 0$ consider the bilateral Gamma distribution which is obtained as the law of $X - Y$ where $X$ and $Y$ are independent and both $\Gamma(\gamma, \beta/2)$-distributed. This bilateral Gamma distribution is infinitely divisible with the following characteristic function and Levy triplet:

$$\varphi_\Gamma(u) := (1 + \gamma^{-2} u^2)^{-\beta/2}, \qquad b_\Gamma = 0, \quad \sigma_\Gamma = 0, \quad \nu_\Gamma(dx) := \tfrac{\beta}{2} |x|^{-1} e^{-\gamma|x|}\, dx.$$

Its density $f_\Gamma$ satisfies $f_\Gamma(x) \ge c e^{-\gamma|x|}$ for some $c > 0$ (Küchler and Tappe 2008). For $\sigma \ge 0$ consider the infinitely divisible distribution with characteristic function

$$\varphi_0(u) := \varphi_\Gamma(u)\, e^{-\sigma^2 u^2/2}, \tag{6.14}$$

which has a density $f_0$ that is a convolution of $f_\Gamma$ with a normal density and therefore still satisfies $f_0(x) \ge c e^{-\gamma|x|}$ with some $c > 0$. The corresponding Levy density satisfies $\nu_0 = \nu_\Gamma$.

Let us further introduce for $K > 0$ and $\rho > 0$

$$\mu_K(x) := e^{-x^2/(2\rho^2)} \sin(Kx).$$

For any $\beta > 0$ and $\gamma > 0$ we can choose $\rho$ sufficiently small such that $\nu_0(x) + \mu_K(x) \ge 0$ holds for all $K > 0$. In this case the following characteristic function also generates an infinitely divisible distribution:

$$\varphi_K(u) := \varphi_0(u) \exp\Big( \int (e^{iux} - 1)\, \mu_K(x)\, dx \Big) = \varphi_0(u) \exp\big(\mathcal{F}\mu_K(u)\big).$$

Using the fact that $\sin(Kx)\sin(ux) = (\cos((K - u)x) - \cos((K + u)x))/2$ we obtain the following explicit calculation of the Fourier transform of $\mu_K$:



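$$\begin{aligned}
\mathcal{F}\mu_K(u) &= i \int e^{-x^2/(2\rho^2)} \sin(Kx) \sin(ux)\, dx \\
&= \frac{i}{2} \int e^{-x^2/(2\rho^2)} \cos((K - u)x)\, dx - \frac{i}{2} \int e^{-x^2/(2\rho^2)} \cos((K + u)x)\, dx \\
&= i\rho\sqrt{\pi/2}\, \big( e^{-\rho^2 (K - u)^2/2} - e^{-\rho^2 (K + u)^2/2} \big).
\end{aligned}$$

Note that $\varphi_K$ has the same decay behavior as $\varphi_0$ due to $\lim_{|u| \to \infty} \mathcal{F}\mu_K(u) = 0$. Therefore $\nu_{0,\sigma}$ and $\nu_{K,\sigma}$ lie in the class $\mathcal{A}(C, \sigma)$ ($\sigma > 0$) or $\mathcal{C}(C, \bar{C}, \beta)$ ($\sigma = 0$), respectively, provided $C$, $\bar{C}$ are large enough.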
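The closed form of the Fourier transform can be checked against direct numerical integration. The sketch below assumes the perturbation $\mu_K(x) = e^{-x^2/(2\rho^2)}\sin(Kx)$ and the convention $\mathcal{F}g(u) = \int e^{iux} g(x)\,dx$; the parameter values, the integration window of ten standard deviations and the function names are choices made for this illustration only.

```python
import cmath
import math

def mu_K(x, K, rho):
    """Assumed perturbation mu_K(x) = exp(-x^2 / (2 rho^2)) * sin(K x)."""
    return math.exp(-x * x / (2.0 * rho * rho)) * math.sin(K * x)

def fourier_numeric(u, K, rho, half_width=10.0, steps=40000):
    """Trapezoidal approximation of int exp(i u x) mu_K(x) dx over
    [-half_width * rho, half_width * rho], outside of which mu_K is negligible."""
    a = -half_width * rho
    h = 2.0 * half_width * rho / steps
    total = 0j
    for k in range(steps + 1):
        x = a + k * h
        term = cmath.exp(1j * u * x) * mu_K(x, K, rho)
        total += 0.5 * term if k in (0, steps) else term
    return total * h

def fourier_closed_form(u, K, rho):
    """i rho sqrt(pi/2) (exp(-rho^2 (K-u)^2 / 2) - exp(-rho^2 (K+u)^2 / 2))."""
    return 1j * rho * math.sqrt(math.pi / 2.0) * (
        math.exp(-rho ** 2 * (K - u) ** 2 / 2.0)
        - math.exp(-rho ** 2 * (K + u) ** 2 / 2.0))

K, rho = 5.0, 0.5
errors = [abs(fourier_numeric(u, K, rho) - fourier_closed_form(u, K, rho))
          for u in (0.0, 2.0, 4.5, 5.0, 8.0)]
print(max(errors))
```

The transform is purely imaginary and concentrated around $u = \pm K$, which is why the perturbation only affects the characteristic function at frequencies near $K$.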

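Let us now estimate the $\chi^2$-distance between the distributions with characteristic functions $\varphi_K$ and $\varphi_0$:

$$\begin{aligned}
\chi^2(f_K, f_0) &= \int \frac{(f_K(x) - f_0(x))^2}{f_0(x)}\, dx \\
&\le c^{-1} \int \big( e^{\gamma|x|/2} f_K(x) - e^{\gamma|x|/2} f_0(x) \big)^2\, dx \\
&\le c^{-1} \Big( \int_{-\infty}^{\infty} \big( e^{\gamma x/2} f_K(x) - e^{\gamma x/2} f_0(x) \big)^2\, dx + \int_{-\infty}^{\infty} \big( e^{-\gamma x/2} f_K(x) - e^{-\gamma x/2} f_0(x) \big)^2\, dx \Big).
\end{aligned} \tag{6.15}$$

For functions $g$ whose Fourier transform can be extended holomorphically to complex values $z$ with $|\mathrm{Im}(z)| < \gamma$ we have:

$$\mathcal{F}\big(e^{\pm\gamma x/2} g(x)\big)(u) = \int g(x)\, e^{i(u \pm (-i)\gamma/2)x}\, dx = \mathcal{F}g\big(u \pm (-i)\gamma/2\big).$$

Using this identity in Plancherel's formula and then the estimate $|e^z - 1| \le |z| e^{|\mathrm{Re}(z)|}$, $z \in \mathbb{C}$, together with $|\mathcal{F}\mu_K(u)| \le \|\mu_K\|_{L^1}$, we continue from (6.15):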



^-1 />oo 

— 1 /"CJO 



27r 





n/2) - 




(n- 


3 










+ ^ + 






4 


T 


7 





i7/2)|2 + |v9;^(u + i-i/2) - ifoiu + i7/2)P) du 



p2||AtA'lli,i r°o 



2c7r 



=2||aja'|Ii,i n2 



3 ^^-^ 

4 + 72 



4c7r 



3 n 
4 + y 



(|.F^i^(u-n/2)P + \J'^MK{u + i^/2)\'^) du 

2\ -/3 



The last line is for $K \to \infty$ of order $\int e^{-\sigma^2 u^2} (1 + u^2)^{-\beta} \big( e^{-\rho^2 (u - K)^2} + e^{-\rho^2 (u + K)^2} \big)\, du$. In the case $\sigma = 0$ (polynomial decay) this gives the order $K^{-2\beta}$, whereas for $\sigma > 0$ (Gaussian part) the order is $e^{-\sigma^2 K^2 (1 + o(1))}$.
For $n$ observations the distributions do not separate provided $K^{-2\beta} \lesssim n^{-1}$ ($\sigma = 0$) and $e^{-\sigma^2 K^2 (1 + o(1))} \lesssim n^{-1}$ ($\sigma > 0$), respectively. Consequently, when choosing $K_n \asymp n^{1/(2\beta)}$ ($\sigma = 0$), respectively $K_n = c\sqrt{\log(n)}$ with $c > 0$ sufficiently large ($\sigma > 0$), this closeness of the distributions implies (Korostelev and Tsybakov 1993) that for any sequence of estimators $(\hat\nu_{\sigma,n})_n$ we have

$$\liminf_{n \to \infty} \Big\{ P_0\big( \ell_s(\hat\nu_{\sigma,n}, \nu_{0,\sigma}) \ge \ell_s(\nu_{K_n,\sigma}, \nu_{0,\sigma})/2 \big) + P_{K_n}\big( \ell_s(\hat\nu_{\sigma,n}, \nu_{K_n,\sigma}) \ge \ell_s(\nu_{K_n,\sigma}, \nu_{0,\sigma})/2 \big) \Big\} > 0.$$

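It remains to consider the loss $\ell_s$ between the alternatives. Using the formula $\mathcal{F}\big(x^2 e^{-x^2/(2\rho^2)}\big)(u) = \sqrt{2\pi}\,\rho^3 (1 - \rho^2 u^2) e^{-\rho^2 u^2/2}$, we calculate:

$$\begin{aligned}
\ell_s(\nu_{K,\sigma}, \nu_{0,\sigma})
&= \sup_{f \in F_s} \frac{1}{2\pi} \Big| \mathrm{Im}\Big( \big( \mathcal{F}f * \mathcal{F}(x^2 e^{-x^2/(2\rho^2)}) \big)(K) \Big) \Big| \\
&= \sup_{f \in F_s} \frac{1}{2\pi} \Big| \int_{-\infty}^{\infty} \mathrm{Im}(\mathcal{F}f(u))\, \sqrt{2\pi}\,\rho^3 \big(1 - \rho^2 (K - u)^2\big) e^{-\rho^2 (K - u)^2/2}\, du \Big| \\
&= \frac{\rho^3}{\sqrt{2\pi}} \sup_{u \in \mathbb{R}} \Big[ (1 + |u|)^{-s} \big|1 - \rho^2 (K - u)^2\big|\, e^{-\rho^2 (K - u)^2/2} \Big].
\end{aligned}$$

Setting $\varepsilon := \liminf_{n \to \infty} K_n^s\, \ell_s(\nu_{K_n,\sigma}, \nu_{0,\sigma})/2 > 0$, we have thus shown

$$\liminf_{n \to \infty}\, \inf_{\hat\nu_{\sigma,n}}\, \sup_{b, \nu_\sigma} P_{b,\nu_\sigma}\big( K_n^s\, \ell_s(\hat\nu_{\sigma,n}, \nu_\sigma) \ge \varepsilon \big) > 0.$$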



For $\sigma = 0$ (polynomial decay) this gives the desired lower bound $K_n^{-s} = n^{-s/(2\beta)}$ for any $\beta > 0$ and for $s \le \beta$. For $s > \beta$ a standard parametric argument shows that the minimax rate is never faster than $n^{-1/2}$. For $\sigma > 0$ (Gaussian part) we obtain the lower bound $K_n^{-s} = (\log n)^{-s/2}$, which matches exactly the upper bound.
In the case (b), i.e. where $|\varphi(u)| \ge C e^{-\alpha|u|}$, we consider instead of (6.14)

$$\varphi_0(u) = \varphi_\Gamma(u)\, \varphi_\alpha(u),$$

where $\varphi_\alpha$ is an infinitely divisible characteristic function with $|\varphi_\alpha(u)| \asymp e^{-\alpha|u|}$ such that the corresponding density function has faster exponential decay than $f_\Gamma$. For example, a tempered stable law (Cont and Tankov 2004, Prop. 4.2) with $\nu(dx) = a|x|^{-2} e^{-\lambda|x|}\, dx$ and $\lambda > 0$ sufficiently large meets these requirements. The remaining steps of the proof are exactly the same, just replace $e^{-\sigma^2 u^2/2}$ by $e^{-\alpha|u|}$. □



6.4. Proof of Proposition 5.1. Note first that $E_{b,\nu_\sigma}|\hat b_n - b| = O(n^{-1/2})$ follows directly from $E X_1^2 < \infty$.

To prove the result for the jump measure, we distinguish between two cases. We set

$$\hat\Psi_n'(u) = \varphi_n'(u)/\varphi_n(u) \quad \text{and} \quad \hat\Psi_n''(u) = \varphi_n''(u)/\varphi_n(u) - \big(\varphi_n'(u)/\varphi_n(u)\big)^2.$$

Case 1: $|\varphi(u)| \ge 2\kappa n^{-1/2}$

It follows from (6.6) and (6.7) that



(pn{ u) - Lp{u) 
Lp{u) 
ipn{u) - ip{u) 



(6.16) + 



say. 



ip{u) 

+ |^1^<t(u)|I||^,^(„)|<^„-i/2| 

We obtain from the inequality 



+ 



-{|(p„(«)|^Kn-i/2} 



ip{u) 



{|<^„(«)|^Kn-i/2} 



2{|<^„(«)|^Kn-i/2} ^ 1 ^ \(pn{u) - ip{u) 



that 
(6.17) 



E 



holds for all $p \in \mathbb{N}$. This implies, by $\hat\Psi_n'(u) = (\varphi_n'(u) - \varphi'(u))/\varphi_n(u) + \Psi'(u)\varphi(u)/\varphi_n(u)$,
that



(6.18) 



E 



I 



{|.p„(«)|^Kn-l/2} 



Therefore, we obtain that 

(6.19) ET„,i = O 

Since 



n 



-1/2 



|^(n)| 



0((1 + |V(n)|)P) 



:i + i^'(n)i)' 



^n(w) 

^^(^) - V^^^(^) 
we obtain, in conjunction with (6.17) and (6.18), that



+ [V\u) + (^'(n))- 



2\ '^{u) 



E 



I 



We conclude that 
(6.20) 



-{|<^„(«)|^Kn-i/2} 



-1/2 



o (1 + m^)!)' . 



Finally, it follows from Hoeffding's inequality for bounded random variables that

$$\begin{aligned}
P\big(|\varphi_n(u)| < \kappa n^{-1/2}\big) &\le P\big(|\varphi_n(u) - \varphi(u)| > |\varphi(u)| - \kappa n^{-1/2}\big) \\
&\le P\big(|\varphi_n(u) - \varphi(u)| > |\varphi(u)|/2\big) \\
&\le \exp(-c\, n\, |\varphi(u)|^2)
\end{aligned}$$

for some $c > 0$. This yields that $P(|\varphi_n(u)| < \kappa n^{-1/2}) = O(n^{-1/2} |\varphi(u)|^{-1})$ and therefore



(6.21) 



ET„^.3 = O 



Equations (6.16), (6.19), (6.20), and (6.21) yield the desired bound in the case $|\varphi(u)| \ge 2\kappa n^{-1/2}$.

Case 2: $|\varphi(u)| < 2\kappa n^{-1/2}$

In contrast to Case 1, this time we use the following decomposition:



(6.22) + 



(pn{u) 



+ |^1^<t(u)|I||^,^(„)|<^„-i/2|. 



-{|<^„(«)|^Kn-i/2} 



{|<p„{«)|^Kn-l/2} 



Taking into account that $\Psi''$ is bounded and using again (6.18) as well as the trivial estimate $|\mathcal{F}\nu_\sigma(u)| \le \nu_\sigma(\mathbb{R}) < \infty$ we obtain that



E |.FI7,,„(n) - J^y,{u)\ = 0((1 + l^'(n)l) 



as required. 



□ 



Acknowledgment. We thank Peter Tankov for the idea how to prove Lemma 6.1 and